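In this work, we tackle the challenging task of jointly tracking hand and object poses and reconstructing their shapes from depth point cloud sequences in the wild, using HandTrackNet to estimate inter-frame hand motion. HandTrackNet introduces a novel hand pose canonicalization module that eases the tracking task and yields accurate and robust hand joint tracking. Our pipeline then reconstructs the full hand by converting the predicted hand joints into the template-based parametric hand model MANO. For object tracking, we devise a simple yet effective module that estimates the object SDF from the first frame and performs optimization-based tracking. Finally, a joint optimization step performs joint hand and object reasoning, which alleviates occlusion-induced ambiguity and further refines the hand pose. During training, the whole pipeline sees only purely synthetic data, which are synthesized with sufficient variation and through depth simulation to ease generalization. The whole pipeline is designed with generalization gaps in mind and can therefore be transferred directly to real in-the-wild data. We evaluate our method on two real hand-object interaction datasets, HO3D and DexYCB, without any fine-tuning. Our experiments show that the proposed method significantly outperforms previous depth-based hand and object pose estimation and tracking methods while running at 9 FPS.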
Translated by Google Translate
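Deep-learning-based latent representations have been widely used in numerous scientific visualization applications, such as isosurface similarity analysis, volume rendering, flow field synthesis, and data reduction, to name a few. However, existing latent representations are mostly generated from raw data in an unsupervised manner, which makes it difficult to incorporate domain interest to control the size of the latent representation and the quality of the reconstructed data. In this paper, we present a novel importance-driven latent representation to facilitate domain-interest-guided scientific data visualization and analysis. We use spatial importance maps to represent various scientific interests and feed them into a feature transformation network to guide latent generation. We further reduce the latent size with a lossless entropy encoding algorithm trained together with the autoencoder, improving storage efficiency. We evaluate the effectiveness and efficiency of the latents generated by our method both qualitatively and quantitatively, using data from multiple scientific visualization applications.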
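We propose VDL-Surrogate, a view-dependent neural-network-latent-based surrogate model for parameter space exploration of ensemble simulations that allows high-resolution visualizations and user-specified visual mappings. Surrogate-enabled parameter space exploration lets domain scientists preview simulation results without running a large number of computationally costly simulations. Limited by computational resources, however, existing surrogate models may not produce previews with sufficient resolution for visualization and analysis. To use computational resources more efficiently and to support high-resolution exploration, we perform ray casting from different viewpoints to collect samples and produce compact latent representations. This latent encoding process reduces the cost of surrogate model training while maintaining output quality. In the model training stage, we select viewpoints to cover the whole viewing sphere and train a corresponding VDL-Surrogate model for each selected viewpoint. In the model inference stage, we predict the latent representations at the previously selected viewpoints and decode them to data space. For any given viewpoint, we interpolate over the decoded data at the selected viewpoints and generate visualizations with user-specified visual mappings. We demonstrate the effectiveness and efficiency of VDL-Surrogate on cosmological and ocean simulations with quantitative and qualitative evaluations. Source code is publicly available at https://github.com/trainsn/vdl-surrogate.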
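There has been significant progress in developing reinforcement learning (RL) training systems. Past works such as IMPALA, Ape-X, SEED RL, and Sample Factory aim to improve the system's overall throughput. In this paper, we try to address a common bottleneck in RL training systems, namely parallel environment execution, which is often the slowest part of the whole system but receives little attention. With a curated design for RL environments, we improve RL environment simulation speed across different hardware setups, ranging from a laptop and a modest workstation to a high-end machine such as the NVIDIA DGX-A100. On a high-end machine, EnvPool achieves 1 million frames per second for environment execution on Atari environments and 3 million frames per second on MuJoCo environments. When running on a laptop, EnvPool is 2.8 times faster than the Python subprocess. Moreover, strong compatibility with existing RL training libraries, including CleanRL, rl_games, and DeepMind Acme, has been demonstrated in the open-source community. Finally, EnvPool allows researchers to iterate on their ideas at a much faster pace and has great potential to become the de facto RL environment execution engine. Example runs show that it takes only 5 minutes to train Atari Pong and MuJoCo Ant on a laptop. EnvPool is open-sourced at https://github.com/sail-sg/envpool.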
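This paper focuses on text data augmentation for few-shot NLP tasks. Existing data augmentation algorithms either apply task-independent heuristic rules (e.g., synonym replacement) or fine-tune general-purpose pre-trained language models (e.g., GPT2) on a small training set to produce new synthetic data. Consequently, these methods carry little task-specific knowledge and are limited to yielding low-quality synthetic data for weak baselines on simple tasks. To address this issue, we propose the Knowledge Mixture Data Augmentation Model (KnowDA): an encoder-decoder LM pre-trained on a mixture of diverse NLP tasks using Knowledge Mixture Training (KoMT). KoMT is a training procedure that reformulates input examples from various heterogeneous NLP tasks into a unified text-to-text format and employs denoising objectives of different granularity to learn to generate partial or complete samples. With the aid of KoMT, KnowDA can implicitly combine the required task-specific knowledge from the mixture of tasks and quickly grasp the inherent synthesis law of the target task from a few given instances. To the best of our knowledge, we are the first to attempt to scale up the number of tasks in multi-task co-training for data augmentation. Extensive experiments show that i) KnowDA successfully improves the performance of ALBERT and DeBERTa on few-shot benchmarks, outperforming previous state-of-the-art data augmentation baselines; ii) KnowDA also improves model performance on few-shot tasks of a held-out task type not included in KoMT.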
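Artificial intelligence (AI) systems have become increasingly popular in many areas. Nevertheless, AI technologies are still in their developing stages, and many issues need to be addressed. Among them, the reliability of AI systems needs to be demonstrated so that the general public can use them with confidence. In this paper, we provide a statistical perspective on the reliability of AI systems. Unlike other considerations, the reliability of an AI system focuses on the time dimension; that is, whether the system can perform its designed functionality for the intended period. We introduce a so-called SMART statistical framework for AI reliability research, which includes five components: system structure, reliability metrics, failure cause analysis, reliability assessment, and test planning. We review traditional methods in reliability data analysis and software reliability, and discuss how those existing methods can be adapted for reliability modeling and assessment of AI systems. We also describe recent developments in modeling and analyzing AI reliability and outline statistical research challenges in this area, including out-of-distribution detection, the effect of the training set, adversarial attacks, model accuracy, and uncertainty quantification, and discuss how these topics relate to AI reliability, with illustrative examples. Finally, we discuss data collection and test planning for AI reliability assessment and how to improve system designs for higher AI reliability. The paper closes with some concluding remarks.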
As natural language processing (NLP) for gender bias becomes a significant interdisciplinary topic, prevalent data-driven techniques such as large-scale language models suffer from inadequate data and biased corpora, especially for languages with insufficient resources such as Chinese. To this end, we propose a Chinese cOrpus foR Gender bIas Probing and Mitigation (CORGI-PM), which contains 32.9k sentences with high-quality labels derived by following an annotation scheme specifically developed for gender bias in the Chinese context. Moreover, we address three challenges for automatic textual gender bias mitigation, which require models to detect, classify, and mitigate textual gender bias. We also conduct experiments with state-of-the-art language models to provide baselines. To the best of our knowledge, CORGI-PM is the first sentence-level Chinese corpus for gender bias probing and mitigation.
Forward-Looking Sonar (FLS) has started to gain attention in the field of near-bottom close-range underwater inspection because of its high resolution and high frame rate. Although Automatic Target Recognition (ATR) algorithms have been applied tentatively to object-searching tasks, human supervision is still indispensable, especially where critical areas are involved. A clear FLS mosaic containing all suspicious information is needed to help experts deal with the large volume of perception data. However, previous work only considered FLS operating in an ideal system configuration, which assumes an appropriate sonar imaging setup and the availability of accurate positioning data. Without those guarantees, intra-frame and inter-frame artifacts appear and degrade the quality of the final mosaic by making the information of interest invisible. In this paper, we propose a novel blending method for FLS mosaicing that preserves the information of interest. A Long-Short Time Sliding Window (LST-SW) is designed to rectify the local statistics of raw sonar images. The statistics are then used to construct a Global Variance Map (GVM). The GVM helps to emphasize the useful information contained in images during the blending phase by classifying informative and featureless pixels, thereby enhancing the quality of the final mosaic. The method is verified using data collected in a real environment. The results show that our method preserves more details in FLS mosaics for human inspection in practice.
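The abstract does not spell out how the LST-SW or the GVM is computed, so the following is only a loose illustration of the general idea of weighting pixels by local variance during blending. It is not the paper's method: the function names `local_variance_map` and `variance_weighted_blend`, the square spatial window, and the `eps` regularizer are all assumptions made for this sketch, and the two input images are assumed to be already registered and of equal size.

```python
def local_variance_map(image, radius=1):
    """Per-pixel variance over a (2*radius+1)^2 neighborhood, clipped at borders.

    `image` is a list of equal-length rows of numbers (hypothetical input format).
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            mean = sum(window) / len(window)
            out[r][c] = sum((v - mean) ** 2 for v in window) / len(window)
    return out


def variance_weighted_blend(img_a, img_b, radius=1, eps=1e-6):
    """Blend two registered images, favoring the one with more local variance.

    Featureless (low-variance) pixels get small weights, so textured regions
    dominate the result. `eps` avoids division by zero where both are flat.
    """
    var_a = local_variance_map(img_a, radius)
    var_b = local_variance_map(img_b, radius)
    blended = []
    for r in range(len(img_a)):
        row = []
        for c in range(len(img_a[0])):
            w_a = var_a[r][c] + eps
            w_b = var_b[r][c] + eps
            row.append((w_a * img_a[r][c] + w_b * img_b[r][c]) / (w_a + w_b))
        blended.append(row)
    return blended


if __name__ == "__main__":
    flat = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
    spot = [[0, 0, 0], [0, 90, 0], [0, 0, 0]]
    for row in variance_weighted_blend(flat, spot):
        print([round(v, 3) for v in row])
```

In this toy case the flat image contributes almost nothing, so the bright spot from the second image survives blending instead of being averaged away, which is the behavior a plain mean blend would lose.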
This paper introduced key aspects of applying Machine Learning (ML) models, improved trading strategies, and the Quasi-Reversibility Method (QRM) to optimize stock option forecasting and trading results. It presented the findings of the follow-up project of the research "Application of Convolutional Neural Networks with Quasi-Reversibility Method Results for Option Forecasting". First, the project included an application of Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM) networks to provide a novel way of predicting stock option trends. Additionally, it examined the dependence of the ML models by evaluating the experimental method of combining multiple ML models to improve prediction results and decision-making. Lastly, two improved trading strategies and simulated investing results were presented. The Binomial Asset Pricing Model with discrete time stochastic process analysis and portfolio hedging was applied and suggested an optimized investment expectation. These results can be utilized in real-life trading strategies to optimize stock option investment results based on historical data.
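The abstract names the Binomial Asset Pricing Model without giving its formulas. As a rough illustration only, the sketch below prices a European option on a recombining binomial tree by backward induction under the risk-neutral measure. The Cox-Ross-Rubinstein parameterization (up factor `u = exp(sigma * sqrt(dt))`, down factor `d = 1 / u`), the function name `binomial_option_price`, and every numeric input are assumptions chosen for the example, not the project's actual configuration or data.

```python
import math


def binomial_option_price(s0, strike, rate, sigma, maturity, steps, kind="call"):
    """Price a European option on a Cox-Ross-Rubinstein binomial tree.

    s0: spot price, strike: strike price, rate: continuously compounded
    risk-free rate, sigma: volatility, maturity: time to expiry in years,
    steps: number of tree steps, kind: "call" or "put".
    """
    if kind not in ("call", "put"):
        raise ValueError("kind must be 'call' or 'put'")
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))   # up factor per step (CRR assumption)
    d = 1.0 / u                           # down factor per step
    growth = math.exp(rate * dt)          # risk-free growth over one step
    p = (growth - d) / (u - d)            # risk-neutral probability of an up move
    if not 0.0 < p < 1.0:
        raise ValueError("parameters admit arbitrage; use more steps")
    sign = 1.0 if kind == "call" else -1.0

    # Terminal payoffs; index j counts the number of up moves.
    values = [
        max(sign * (s0 * u ** j * d ** (steps - j) - strike), 0.0)
        for j in range(steps + 1)
    ]
    # Backward induction: discounted risk-neutral expectation at each node.
    for _ in range(steps):
        values = [
            (p * values[j + 1] + (1.0 - p) * values[j]) / growth
            for j in range(len(values) - 1)
        ]
    return values[0]


if __name__ == "__main__":
    call = binomial_option_price(100.0, 100.0, 0.05, 0.2, 1.0, 500, "call")
    put = binomial_option_price(100.0, 100.0, 0.05, 0.2, 1.0, 500, "put")
    print(f"call={call:.4f} put={put:.4f}")
```

A useful sanity check on any such implementation is put-call parity: the call price minus the put price should equal `s0 - strike * exp(-rate * maturity)` up to floating-point error, because the tree reproduces the discounted expected stock price exactly.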
Multi-modal fusion is a basic task in autonomous driving perception and has attracted considerable research interest in recent years. Current multi-modal fusion methods mainly focus on camera data and LiDAR data, but pay little attention to the kinematic information provided by the vehicle's bottom sensors, such as acceleration, vehicle speed, and angle of rotation. This information is not affected by complex external scenes, so it is more robust and reliable. In this paper, we introduce the existing application fields of vehicle bottom information and the research progress of related methods, as well as multi-modal fusion methods based on bottom information. We also describe the available vehicle bottom information datasets in detail to facilitate further research. In addition, we propose new ideas for multi-modal fusion in autonomous driving tasks to promote further use of vehicle bottom information.